Part 1: Simulation with Metropolis-Hastings and Comparison
Task: Simulate a normal (Gaussian) distribution using the Metropolis-Hastings algorithm, targeting a mean (\(\mu\)) of 0 and variance (\(\sigma^2\)) of 1.
Approach: Implement the Metropolis-Hastings algorithm within the framework provided by the Elvis_simple.jl file.
Acceptance Criterion: Define the acceptance criterion based on the ratio of the target distribution probabilities and the proposal distribution probabilities.
Solution
The Metropolis-Hastings algorithm performs a sampling from a target distribution \(\pi\) by generating a Markov chain with a stationary distribution of \(\pi\).
My solution is based on a proposal distribution with mean 0 and variance 1.5, as seen below. I run the solution 10,000 times.
Defining the proposal distribution
The code in Julia below defines the proposal distribution based on a normal distribution with mean 0 and variance 2.
usingDistributionsusingRandomusingDataFramesfunctionmetropolis_hastings(n_samples::Int)# Initialize the chain chain =zeros(n_samples) current =randn() # Start at a random place# Target distribution target =Normal(0, 1.5)for i in2:n_samples proposal = current +randn() # Add some noise# Calculate acceptance probability p_accept =min(1, pdf(target, proposal) /pdf(target, current))# Accept or rejectifrand() < p_accept current = proposalend chain[i] = currentendreturn chainend# Generate 100,000 samplessamples =metropolis_hastings(100000)# Make it a dataframesamples_df =DataFrame(x= samples)
100000×1 DataFrame
99975 rows omitted
Row
x
Float64
1
0.0
2
0.307162
3
-1.45251
4
-1.45251
5
-0.754818
6
-2.49602
7
-1.85637
8
0.281805
9
0.00660573
10
0.287755
11
0.287755
12
1.0189
13
1.08958
⋮
⋮
99989
-0.114545
99990
-0.544227
99991
-1.42831
99992
-0.448857
99993
-0.644258
99994
-1.42279
99995
0.414336
99996
-0.142144
99997
-1.84527
99998
-1.84527
99999
-2.05246
100000
-1.42089
I plot the proposal distribution below to see how “normal” it looks. I use TidierPlots, the amazing Julia implementation of R’s ggplot2. I also overlay a standard normal distribution (randn()) to compare the two, where the standard normal distribution is the darker distribution overlaid in the histogram below.
height: 400
x: x
title: Metropolis-Hastings
width: 600
y: Frequency
geom_histogram
data: inherits from plot
x: not specified
geom_histogram
data: A DataFrame (100000 rows, 1 columns)
x: normal
The data looks pretty normal, however, it can noted that my sample has wider tails, probably due to the greater variance. Below, I also computed descriptive statistics using TidierData, the Julia implementation of R’s dplyr.
The mean is approximately 0, and the standard deviation is approximately 1.5, which are close to the target distribution’s \(\pi\) defined mean and standard deviation, so the Metropolis-Hastings algorithm is working well in this regard. A Q-Q plot is presented below to see how well it fits a normal distribution.
The data has heavier tails than a normal distribution, but it is still a good approximation. As mentioned the heavier tails probably come from the variance of 1.5 in the target distribution \(\pi\).